AITopics | evaluation game

evaluation game

df308fd90635b28d82558cf580c73ed9-AuthorFeedback.pdf

Neural Information Processing Systems | Aug-20-2025, 05:55:56 GMT

agent, evaluation game, listener, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.30)

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

Zhang, Shenao, Shen, Li, Han, Lei, Shen, Li

arXiv.org Artificial Intelligence | Jun-5-2023

In multi-agent reinforcement learning, the behaviors that agents learn in a single Markov Game (MG) are typically confined to the given number of agents. Every single MG induced by varying the population may possess distinct optimal joint strategies and game-specific knowledge, which are modeled independently in modern multi-agent reinforcement learning algorithms. In this work, our focus is on creating agents that can generalize across population-varying MGs. Instead of learning a unimodal policy, each agent learns a policy set comprising effective strategies across a variety of games. To achieve this, we propose Meta Representations for Agents (MRA), which explicitly models the game-common and game-specific strategic knowledge. By representing the policy sets with multi-modal latent policies, the game-common strategic knowledge and diverse strategic modes are discovered through an iterative optimization procedure. We prove that by approximately maximizing the resulting constrained mutual information objective, the policies can reach a Nash equilibrium in every evaluation MG when the latent space is sufficiently large. When deploying MRA in practical settings with limited latent space sizes, fast adaptation can be achieved by leveraging first-order gradient information. Extensive experiments demonstrate the effectiveness of MRA in improving training performance and generalization ability in challenging evaluation games.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2108.12988

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)

Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

Soemers, Dennis J. N. J., Piette, Éric, Stephenson, Matthew, Browne, Cameron

arXiv.org Machine Learning | May-14-2019

In recent years, state-of-the-art game-playing agents have often involved policies that are trained in self-play processes where Monte Carlo tree search (MCTS) algorithms and trained policies iteratively improve each other. The strongest results have been obtained when policies are trained to mimic the search behaviour of MCTS by minimising a cross-entropy loss. Because MCTS, by design, includes an element of exploration, policies trained in this manner are also likely to exhibit a similar extent of exploration. In this paper, we are interested in learning policies for a project with future goals including the extraction of interpretable strategies, rather than state-of-the-art game-playing performance. For these goals, we argue that such an extent of exploration is undesirable, and we propose a novel objective function for training policies that are not exploratory. We derive a policy gradient expression for maximising this objective function, which can be estimated using MCTS value estimates rather than MCTS visit counts. We empirically evaluate various properties of the resulting policies in a variety of board games.
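
The contrast drawn in this abstract can be illustrated with a small sketch. It assumes a softmax policy over three moves at a single state, and it assumes that the value-based objective is the expected MCTS value under the policy, J = sum_a pi(a) * Q(a). The abstract does not state the objective, so this form, the function names, and the numbers below are illustrative assumptions rather than the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_grad(logits, visit_counts):
    """Gradient of the cross-entropy loss w.r.t. the logits (to be descended).
    The target is the normalised MCTS visit-count distribution."""
    pi = softmax(logits)
    total = sum(visit_counts)
    target = [n / total for n in visit_counts]
    return [p - t for p, t in zip(pi, target)]

def value_objective_grad(logits, q_values):
    """Gradient of the assumed objective J = sum_a pi(a) * Q(a) w.r.t. the
    logits (to be ascended). Q(a) stands in for MCTS value estimates."""
    pi = softmax(logits)
    expected_value = sum(p * q for p, q in zip(pi, q_values))
    return [p * (q - expected_value) for p, q in zip(pi, q_values)]

def train(grad_fn, signal, ascend, steps=2000, lr=0.5):
    """Plain gradient descent or ascent on the logits, starting from uniform."""
    logits = [0.0] * len(signal)
    sign = 1.0 if ascend else -1.0
    for _ in range(steps):
        grad = grad_fn(logits, signal)
        logits = [x + sign * lr * g for x, g in zip(logits, grad)]
    return softmax(logits)

# Hypothetical search statistics for three legal moves at one state.
visit_counts = [60, 30, 10]   # exploration spreads visits over all moves
q_values = [0.6, 0.5, 0.1]    # value estimates for the same moves

imitation_policy = train(cross_entropy_grad, visit_counts, ascend=False)
value_policy = train(value_objective_grad, q_values, ascend=True)
```

Under these assumptions, the imitation policy reproduces the spread of the visit counts, while the value-based policy concentrates its probability on the move with the highest value estimate, which is the non-exploratory behaviour the abstract argues for.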

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Machine Learning

1905.05809

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
(2 more...)

Biasing MCTS with Features for General Games

Soemers, Dennis J. N. J., Piette, Éric, Browne, Cameron

arXiv.org Artificial Intelligence | Mar-21-2019

This paper proposes using a linear function approximator, rather than a deep neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for general games. This is unlikely to match the potential raw playing strength of DNNs, but has advantages in terms of generality, interpretability and resources (time and hardware) required for training. Features describing local patterns are used as inputs. The features are formulated in such a way that they are easily interpretable and applicable to a wide range of general games, and might encode simple local strategies. We gradually create new features during the same self-play training process used to learn feature weights. We evaluate the playing strength of an MCTS player biased by learnt features against a standard upper confidence bounds for trees (UCT) player in multiple different board games, and demonstrate significantly improved playing strength in the majority of them after a small number of self-play training games.
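
One way to picture a linear function approximator biasing MCTS is sketched below. It assumes binary feature vectors per move, a softmax over the linear scores as a prior, and a PUCT-style selection rule of the kind popularised by AlphaGo Zero. The abstract does not specify the selection rule, so the rule, the constant `c`, the feature vectors, and the weights are all hypothetical stand-ins.

```python
import math

def linear_softmax_prior(feature_vectors, weights):
    """Prior over moves: softmax of the dot product of weights and features."""
    logits = [sum(w * f for w, f in zip(weights, fv)) for fv in feature_vectors]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def uct_score(q, n_child, n_parent, c=math.sqrt(2)):
    """Standard UCB1 score used by an unbiased UCT player."""
    if n_child == 0:
        return math.inf
    return q + c * math.sqrt(math.log(n_parent) / n_child)

def biased_score(q, n_child, n_parent, prior, c=2.5):
    """PUCT-style score: the exploration bonus is scaled by the learnt prior."""
    return q + c * prior * math.sqrt(n_parent) / (1 + n_child)

# Three candidate moves, each described by two binary local-pattern features
# (hypothetical), and hypothetical learnt feature weights.
features = [[0, 0], [1, 0], [0, 1]]
weights = [2.0, -1.0]
priors = linear_softmax_prior(features, weights)

# All children look identical to the search so far: same value, one visit each.
children = [{"q": 0.5, "n": 1} for _ in features]
n_parent = sum(child["n"] for child in children)

uct_choice = max(
    range(len(children)),
    key=lambda i: uct_score(children[i]["q"], children[i]["n"], n_parent),
)
biased_choice = max(
    range(len(children)),
    key=lambda i: biased_score(children[i]["q"], children[i]["n"], n_parent, priors[i]),
)
```

With identical statistics, the plain UCT rule has no basis for preferring any move and falls back on the first one, whereas the biased rule selects the move whose features carry the largest weight. Because the prior is a linear function of interpretable features, the learnt weights can be read off directly.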

artificial intelligence, machine learning, self-play game num, (18 more...)

arXiv.org Artificial Intelligence

1903.08942

Country:

Europe > Netherlands > Limburg > Maastricht (0.04)
Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Experiments on the Acquisition of Cognitive and Linguistic Competence to Communicate Propositional Logic Sentences

Sierra, Josefina (Technical University of Catalonia) | Santibanez, Josefina (University of La Rioja)

AAAI Conferences | Nov-3-2009

We describe some experiments which simulate a grounded approach to the acquisition of the cognitive and linguistic competence required to communicate propositional logic sentences. This encompasses both the construction of a conceptualisation of the environment by each individual agent and of a shared language by the population. The processes of conceptualisation and language acquisition in each individual agent are based on general purpose cognitive capacities, such as categorisation, discrimination, invention, adoption and induction. The construction of a shared language by the population is achieved using a particular type of linguistic interaction, known as the evaluation game, which gives rise to a common set of linguistic conventions through a process of self-organisation. This work addresses the problem of the acquisition of both the semantics and the syntax of propositional logic. Trying to learn these two aspects at the same time is more difficult than learning the semantics or the syntax of propositional logic separately, because the agents must coordinate their linguistic behaviour taking into account only the subset of objects which constitutes the topic of a particular linguistic interaction. This means that a pair of agents can communicate successfully about a particular subset of objects (a topic) even if they use different conceptualisations (formulas) to identify the same topic, which introduces a high degree of ambiguity into the interpretation process the agents have to deal with when they try to construct a shared communication language. In spite of this, the results of the experiments show that, by the end of the simulation runs, the individual agents have built different conceptualisations and grammars, but that these are compatible in the sense that they guarantee the unambiguous communication of propositional logic sentences.
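
The source of ambiguity described in this abstract can be shown with a toy sketch. It is a simplified reading of the abstract, not the authors' protocol: objects are dictionaries of boolean properties, a formula is any predicate over an object, and a game counts as successful when the listener's formula picks out the same subset of the context as the speaker's. All names and contexts are hypothetical.

```python
def extension(formula, context):
    """Indices of the objects in the context that satisfy the formula."""
    return frozenset(i for i, obj in enumerate(context) if formula(obj))

def evaluation_game(speaker_formula, listener_formula, context):
    """Simplified success test: both formulas identify the same topic."""
    topic = extension(speaker_formula, context)
    interpretation = extension(listener_formula, context)
    return topic == interpretation

# The two agents hold different conceptualisations of the same utterance.
def speaker_formula(obj):
    return obj["p"]

def listener_formula(obj):
    return obj["q"]

# In this context, p and q happen to hold for exactly the same object.
context_a = [{"p": True, "q": True}, {"p": False, "q": False}]
# Adding one object separates the two formulas.
context_b = context_a + [{"p": True, "q": False}]

success_a = evaluation_game(speaker_formula, listener_formula, context_a)  # True
success_b = evaluation_game(speaker_formula, listener_formula, context_b)  # False
```

The first game succeeds although the agents use different formulas, so a successful interaction alone does not tell the listener which formula the speaker had in mind. Only later contexts, such as the second one, expose the mismatch.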

artificial intelligence, category, logic & formal reasoning, (17 more...)

AAAI Conferences

2009 AAAI Fall Symposium Series

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Mexico > Puebla (0.04)
Europe > Spain > La Rioja (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
